26 research outputs found

    Caltech-UCSD Birds 200

    Caltech-UCSD Birds 200 (CUB-200) is a challenging image dataset annotated with 200 bird species. It was created to enable the study of subordinate categorization, which is not possible with other popular datasets that focus on basic-level categories (such as PASCAL VOC, Caltech-101, etc.). The images were downloaded from the website Flickr and filtered by workers on Amazon Mechanical Turk. Each image is annotated with a bounding box, a rough bird segmentation, and a set of attribute labels.

    MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

    In this work, we tackle the problem of instance segmentation, the task of simultaneously solving object detection and semantic segmentation. Towards this goal, we present a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. Building on top of the Faster-RCNN object detector, the predicted boxes provide accurate localization of object instances. Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction. Semantic segmentation assists the model in distinguishing between objects of different semantic classes, including background, while the direction prediction, which estimates each pixel's direction towards its corresponding center, allows separating instances of the same semantic class. Moreover, we explore the effect of incorporating recent successful methods from both segmentation and detection (i.e., atrous convolution and hypercolumn). Our proposed model is evaluated on the COCO instance segmentation benchmark and shows performance comparable to other state-of-the-art models.
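
The direction prediction mentioned in the abstract above can be illustrated with a small sketch of how a per-pixel direction label might be derived from an instance center. This is not MaskLab's implementation: the function name `direction_bin`, the choice of 8 sectors, and the coordinate convention are all assumptions made for illustration.

```python
import math


def direction_bin(pixel, center, num_bins=8):
    """Quantize the direction from `pixel` towards `center` into a sector index.

    `num_bins=8` is an illustrative assumption, not a value from the abstract.
    Points are (x, y) tuples; sector 0 starts at the positive x axis and
    sector indices increase with the atan2 angle.
    """
    dx = center[0] - pixel[0]
    dy = center[1] - pixel[1]
    if dx == 0 and dy == 0:
        return 0  # the center itself has no direction; assign sector 0
    angle = math.atan2(dy, dx) % (2 * math.pi)  # map to [0, 2*pi)
    sector_width = 2 * math.pi / num_bins
    return int(angle / sector_width) % num_bins


# Two pixels on opposite sides of the same center get different labels,
# which is the property that can help separate same-class instances.
left_pixel = direction_bin((0, 0), (2, 1))
right_pixel = direction_bin((4, 2), (2, 1))
```

In this sketch, pixels belonging to two adjacent objects of the same class would point towards different centers, so their direction labels differ along the shared boundary.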

    Uncertainty modeling in affective computing

    This disclosure describes techniques that capture the uncertainty in machine-vision-based affect (emotion) perception. The techniques are capable of predicting aleatoric, epistemic, and annotation uncertainty. Measures of uncertainty are important to safety-critical and subjective assessment tasks such as those found in the perception of affective expressions.
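
The abstract above does not say how the uncertainties are computed. As a hedged illustration only, the sketch below shows one common entropy-based decomposition for an ensemble of classifiers: total uncertainty is the entropy of the averaged prediction, the aleatoric part is the average entropy of the individual predictions, and the epistemic part is the difference. The function names and the ensemble setup are assumptions, and annotation uncertainty is not covered here.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one input into (total, aleatoric, epistemic).

    `member_probs` is a list of class-probability lists, one per ensemble
    member (a hypothetical setup, not the disclosure's method).
    """
    n = len(member_probs)
    num_classes = len(member_probs[0])
    mean = [sum(m[i] for m in member_probs) / n for i in range(num_classes)]
    total = entropy(mean)                               # entropy of the mean prediction
    aleatoric = sum(entropy(m) for m in member_probs) / n  # mean of member entropies
    epistemic = total - aleatoric                       # disagreement between members
    return total, aleatoric, epistemic


# Members that agree on an ambiguous label: uncertainty is all aleatoric.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])
# Members that are confident but contradict each other: all epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```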

    A face recognition system for assistive robots

    Assistive robots collaborating with people demand strong human-robot interaction capabilities. In this context, recognizing the person the robot has to interact with is paramount to provide a personalized service and reach a satisfactory end-user experience. To this end, face recognition, a non-intrusive, automatic mechanism of identification using biometric identifiers from a user's face, has gained relevance in recent years, as advances in machine learning and the creation of huge public datasets have considerably improved state-of-the-art performance. In this work we study different open-source implementations of the typical components of state-of-the-art face recognition pipelines, including face detection, feature extraction and classification, and propose a recognition system integrating the most suitable methods for use in assistive robots. Concretely, for face detection we have considered MTCNN, OpenCV's DNN, and OpenPose, while for feature extraction we have analyzed InsightFace and Facenet. We have made public an implementation of the proposed recognition framework, ready to be used by any robot running the Robot Operating System (ROS). The methods in the spotlight have been compared in terms of accuracy and performance on common benchmark datasets, namely FDDB and LFW, to aid the choice of the final system implementation, which has been tested on a real robotic platform. This work is supported by the Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech, the research projects WISER ([DPI2017-84827-R]), funded by the Spanish Government and financed by European Regional Development's funds (FEDER), and MoveCare ([ICT-26-2016b-GA-732158]), funded by the European H2020 program, and by a postdoc contract from the I-PPIT-UMA program financed by the University of Málaga.
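
The final stage of the pipeline described above (classifying a face from its extracted features) can be sketched as a nearest-identity lookup over embeddings. The abstract does not state which classifier the authors use, so this is an assumption: the cosine-similarity matching, the `identify` function, the gallery layout, and the 0.5 threshold are all hypothetical, and the toy 2-D vectors stand in for real embeddings from a feature extractor.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(embedding, gallery, threshold=0.5):
    """Return the best-matching identity in `gallery`, or None if unknown.

    `gallery` maps a person's name to a reference embedding. The threshold
    is an illustrative value, not one reported in the abstract.
    """
    best_name, best_score = None, -1.0
    for name, reference in gallery.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


# Toy gallery with 2-D vectors in place of real face embeddings.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
known = identify([0.9, 0.1], gallery)
unknown = identify([-1.0, -1.0], gallery)
```

Returning None for low-similarity queries lets a robot treat the person as unknown instead of forcing a match to an enrolled user.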